Model Deployment


Privacy-Aware Joint DNN Model Deployment and Partition Optimization for Delay-Efficient Collaborative Edge Inference

Cheng, Zhipeng, Xia, Xiaoyu, Wang, Hong, Liwang, Minghui, Chen, Ning, Fan, Xuwei, Wang, Xianbin

arXiv.org Artificial Intelligence

Edge inference (EI) is a key solution to the growing challenges of delayed response times, limited scalability, and privacy concerns in cloud-based Deep Neural Network (DNN) inference. However, deploying DNN models on resource-constrained edge devices introduces further challenges, including model storage limitations, dynamic service requests, and privacy risks. This paper proposes a novel framework for privacy-aware joint DNN model deployment and partition optimization that minimizes long-term average inference delay under resource and privacy constraints. Specifically, the problem is formulated as a joint optimization over model deployment, user-server association, and model partition strategies. To handle its NP-hardness and future uncertainties, a Lyapunov-based approach transforms the long-term optimization into a sequence of single-time-slot problems while preserving long-term performance guarantees. Additionally, a coalition formation game is proposed for edge server association, and a greedy algorithm is developed for model deployment within each coalition, allowing the problem to be solved efficiently. Extensive simulations show that the proposed algorithms effectively reduce inference delay while satisfying privacy constraints, outperforming baseline approaches across a range of scenarios.
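
To make the deployment step concrete, here is a minimal sketch of a greedy model-placement heuristic under a storage budget. It is not the authors' algorithm: the model names, storage costs, and per-model delay savings are invented, and the paper's actual greedy rule operates within each coalition under its own formulation.

```python
# Hypothetical sketch of a greedy DNN model deployment heuristic for one edge
# server: repeatedly deploy the model with the largest estimated delay saving
# per unit of storage until the storage budget is exhausted. All numbers and
# names are illustrative, not the paper's formulation.

def greedy_deploy(models, storage_budget, delay_saving):
    """models: dict name -> storage cost (MB);
    delay_saving: dict name -> estimated delay reduction (ms) if served at the edge."""
    deployed, used = [], 0.0
    # Rank candidate models by delay saving per unit of storage.
    ranked = sorted(models, key=lambda m: delay_saving[m] / models[m], reverse=True)
    for m in ranked:
        if used + models[m] <= storage_budget:
            deployed.append(m)
            used += models[m]
    return deployed

# Example usage with made-up numbers.
models = {"resnet18": 45.0, "yolov5s": 28.0, "bert-tiny": 60.0}   # MB
saving = {"resnet18": 120.0, "yolov5s": 95.0, "bert-tiny": 80.0}  # ms per request
print(greedy_deploy(models, storage_budget=100.0, delay_saving=saving))
```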


In-situ Self-optimization of Quantum Dot Emission for Lasers by Machine-Learning Assisted Epitaxy

Shen, Chao, Zhan, Wenkang, Pan, Shujie, Hao, Hongyue, Zhuo, Ning, Xin, Kaiyao, Cong, Hui, Xu, Chi, Xu, Bo, Ng, Tien Khee, Chen, Siming, Xue, Chunlai, Liu, Fengqi, Wang, Zhanguo, Zhao, Chao

arXiv.org Artificial Intelligence

Traditional methods for optimizing light source emissions rely on a time-consuming trial-and-error approach. While in-situ optimization of the emission of light source gain media during growth is ideal, it has yet to be realized. In this work, we integrate in-situ reflection high-energy electron diffraction (RHEED) with machine learning (ML) to correlate the surface reconstruction with the photoluminescence (PL) of InAs/GaAs quantum dots (QDs), which serve as the active region of lasers. A lightweight ResNet-GLAM model is employed for real-time processing of RHEED data, enabling effective identification of optical performance. This approach guides the dynamic optimization of growth parameters, allowing real-time feedback control to adjust the QD emission for lasers. We successfully optimized InAs QDs on GaAs substrates, with a 3.2-fold increase in PL intensity and a reduction in full width at half maximum (FWHM) from 36.69 meV to 28.17 meV under initially suboptimal growth conditions. Our automated, in-situ self-optimized lasers with 5-layer InAs QDs achieved electrically pumped continuous-wave operation at 1240 nm with a low threshold current density of 150 A/cm² at room temperature, a performance comparable to that of samples grown through traditional manual multi-parameter optimization. These results mark a significant step toward intelligent, low-cost, and reproducible production of light emitters.
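
The real-time classification loop can be sketched as follows. The ResNet-GLAM architecture is not specified in this abstract, so the snippet below substitutes an off-the-shelf ResNet backbone; the number of quality classes, the frame size, and the feedback rule are assumptions for illustration only.

```python
# Illustrative real-time RHEED-frame classification loop using a stock ResNet
# backbone as a stand-in for the paper's ResNet-GLAM model. Labels, input
# shape, and the control rule are hypothetical.
import torch
import torch.nn as nn
from torchvision.models import resnet18

NUM_CLASSES = 3  # hypothetical emission-quality classes: poor / fair / good

model = resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)
model.eval()

def classify_rheed_frame(frame: torch.Tensor) -> int:
    """frame: (3, 224, 224) tensor built from a single RHEED image."""
    with torch.no_grad():
        logits = model(frame.unsqueeze(0))
    return int(logits.argmax(dim=1))

# Dummy call with a random frame; a growth controller could then adjust
# parameters (e.g. temperature setpoint) whenever the predicted class drops.
pred = classify_rheed_frame(torch.rand(3, 224, 224))
print("predicted quality class:", pred)
```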


VPI-Mlogs: A web-based machine learning solution for applications in petrophysics

Nguyen, Anh Tuan

arXiv.org Artificial Intelligence

Machine learning is an important part of the data science field, and in petrophysics, machine learning algorithms and applications have been widely explored. In this context, the Vietnam Petroleum Institute (VPI) has researched and deployed several effective prediction models, including missing log prediction and fracture zone and fracture density forecasting. As one of our solutions, VPI-MLogs is a web-based deployment platform that integrates data preprocessing, exploratory data analysis, visualisation, and model execution. Built in Python, the most popular data analysis programming language, it gives users a powerful tool for working with petrophysical log data and helps to narrow the gap between general data science knowledge and petrophysical insight. This article focuses on the web-based application, which integrates many of these solutions for handling petrophysical data.
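
As a rough illustration of the "missing log prediction" use case mentioned above, the sketch below trains a regressor to predict one petrophysical curve from others. The column names, file name, and model choice are assumptions; VPI-MLogs' actual pipeline is not described in this abstract.

```python
# Hypothetical missing-log prediction: estimate a sonic curve (DT) from other
# measured curves with a gradient-boosted regressor. All names are placeholders.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

df = pd.read_csv("well_logs.csv")          # hypothetical input file
features, target = ["GR", "RHOB", "NPHI", "RT"], "DT"

X_train, X_test, y_train, y_test = train_test_split(
    df[features], df[target], test_size=0.2, random_state=42)

model = GradientBoostingRegressor().fit(X_train, y_train)
print("R^2 on held-out data:", model.score(X_test, y_test))
```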


StraightLine: An End-to-End Resource-Aware Scheduler for Machine Learning Application Requests

Ching, Cheng-Wei, Guan, Boyuan, Xu, Hailu, Hu, Liting

arXiv.org Artificial Intelligence

The life cycle of machine learning (ML) applications consists of two stages: model development and model deployment. However, traditional ML systems (e.g., training-specific or inference-specific systems) focus on only one stage of this life cycle. They typically aim at optimizing model training or accelerating model inference, and they frequently assume homogeneous infrastructure, which may not reflect real-world scenarios spanning cloud data centers, local servers, containers, and serverless platforms. This paper presents StraightLine, an end-to-end resource-aware scheduler for ML application requests in such hybrid infrastructure. Its key innovation is an empirical dynamic placing algorithm that intelligently places requests based on their unique characteristics (e.g., request frequency, input data size, and data distribution). In contrast to existing ML systems, StraightLine offers end-to-end resource-aware placement and can thereby significantly reduce response time and failure rate for model deployment across heterogeneous computing resources in the hybrid infrastructure.
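
A toy placement rule in the spirit of such a dynamic placing algorithm is sketched below. The thresholds, backend names, and request features are invented for illustration; the paper's actual policy is empirical and more elaborate.

```python
# Hypothetical resource-aware placement: route an inference request to a backend
# based on its observed frequency and input size. Numbers and names are made up.
from dataclasses import dataclass

@dataclass
class Request:
    input_bytes: int
    recent_qps: float   # observed request frequency for this model

def place(req: Request) -> str:
    if req.recent_qps > 50:                 # hot model: keep a warm container
        return "container-pool"
    if req.input_bytes > 5_000_000:         # large payloads: local GPU server
        return "local-gpu-server"
    return "serverless"                     # sporadic, small requests

print(place(Request(input_bytes=120_000, recent_qps=2.0)))   # -> "serverless"
```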


Naming the Pain in Machine Learning-Enabled Systems Engineering

Kalinowski, Marcos, Mendez, Daniel, Giray, Görkem, Alves, Antonio Pedro Santos, Azevedo, Kelly, Escovedo, Tatiana, Villamizar, Hugo, Lopes, Helio, Baldassarre, Teresa, Wagner, Stefan, Biffl, Stefan, Musil, Jürgen, Felderer, Michael, Lavesson, Niklas, Gorschek, Tony

arXiv.org Artificial Intelligence

Context: Machine learning (ML)-enabled systems are being increasingly adopted by companies aiming to enhance their products and operational processes. Objective: This paper aims to deliver a comprehensive overview of the current status quo of engineering ML-enabled systems and lay the foundation to steer practically relevant and problem-driven academic research. Method: We conducted an international survey to collect insights from practitioners on the current practices and problems in engineering ML-enabled systems. We received 188 complete responses from 25 countries. We conducted quantitative statistical analyses on contemporary practices using bootstrapping with confidence intervals and qualitative analyses on the reported problems using open and axial coding procedures. Results: Our survey results reinforce and extend existing empirical evidence on engineering ML-enabled systems, providing additional insights into typical ML-enabled systems project contexts, the perceived relevance and complexity of ML life cycle phases, and current practices related to problem understanding, model deployment, and model monitoring. Furthermore, the qualitative analysis provides a detailed map of the problems practitioners face within each ML life cycle phase and the problems causing overall project failure. Conclusions: The results contribute to a better understanding of the status quo and problems in practical environments. We advocate for the further adaptation and dissemination of software engineering practices to enhance the engineering of ML-enabled systems.
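
For readers unfamiliar with the statistical method mentioned above, the following sketch shows a percentile bootstrap confidence interval for an adoption rate. The data are synthetic, not the survey's responses; only the sample size of 188 is taken from the abstract.

```python
# Percentile bootstrap CI for a proportion, on synthetic survey-style data.
import numpy as np

rng = np.random.default_rng(0)
answers = rng.integers(0, 2, size=188)     # 188 responses, 1 = "uses practice X"

def bootstrap_ci(x, n_boot=10_000, alpha=0.05):
    # Resample respondents with replacement and take percentiles of the means.
    stats = [rng.choice(x, size=len(x), replace=True).mean() for _ in range(n_boot)]
    return np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])

print("95% CI for adoption rate:", bootstrap_ci(answers))
```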


The big tech firms want an AI monopoly – but the UK watchdog can bring them to heel John Naughton

The Guardian

"Monopoly," said Peter Thiel, Silicon Valley's answer to Darth Vader, "is the condition of every successful business." This aspiration is widely shared by Gamman, the new acronynm for the Valley's giants – Google, Apple, Microsoft, Meta, Amazon and Nvidia. And the arrival of AI has sharpened the appetite of each for attaining that blessed state before the others get there. One symptom of their anxiety is the way they have been throwing unconscionable amounts of money at the 70-odd generative AI startups that have mushroomed since it became clear that AI was going to be the new new thing. Microsoft reportedly put 13bn (about 10.4bn) into OpenAI, for example, but it was also the lead investor in a 1.3bn funding round for Inflection, Deepmind co-founder Mustafa Suleyman's startup.


Navigating Privacy and Copyright Challenges Across the Data Lifecycle of Generative AI

Zhang, Dawen, Xia, Boming, Liu, Yue, Xu, Xiwei, Hoang, Thong, Xing, Zhenchang, Staples, Mark, Lu, Qinghua, Zhu, Liming

arXiv.org Artificial Intelligence

The internet has enabled an unprecedented free flow and wide distribution of information on a global scale, accelerating the democratization of information and fueling platforms like Wikipedia, YouTube, and StackOverflow. At the same time, it has lowered barriers against unauthorized data use and piracy. The success of Deep Learning (DL) owes significantly to the availability of large-scale datasets for training DL models [3], predominantly sourced from the internet [4].


Introduction to ML Deployment: Flask, Docker & Locust

#artificialintelligence

You've spent a lot of time on EDA, carefully crafted your features, tuned your model for days and finally have something that performs well on the test set. Now, my friend, we need to deploy the model. After all, any model that stays in the notebook has a value of zero, regardless of how good it is. It might feel overwhelming to learn this part of the data science workflow, especially if you don't have a lot of software engineering experience. Fear not, this post's main purpose is to get you started by introducing one of the most popular frameworks for deployment in Python -- Flask.
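
A minimal Flask serving sketch along the lines the post introduces is shown below: load a pickled model and expose a /predict endpoint. The file name, feature layout, and port are placeholders, and the post's Docker and Locust steps are not shown.

```python
# Minimal Flask model-serving sketch: a pickled scikit-learn-style model behind
# a JSON /predict endpoint. "model.pkl" and the feature format are hypothetical.
import pickle
from flask import Flask, jsonify, request

app = Flask(__name__)
with open("model.pkl", "rb") as f:          # hypothetical trained model
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]   # e.g. [[5.1, 3.5, 1.4, 0.2]]
    preds = model.predict(features).tolist()
    return jsonify({"predictions": preds})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```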


Enabling MLOps in Three Simple Steps

#artificialintelligence

I recently engaged in a project implementing a multiclass classification prediction system on financial transactional data, comprising over 10 million records and more than 70 classes. Through this project, I constructed a streamlined end-to-end machine learning operations (MLOps) infrastructure that is well suited to this specific use case while maintaining cost efficiency. The term MLOps covers a broad range of concepts and definitions, depending on the vendor or solution. Some focus on aspects such as training traceability and experiment tracking, while others prioritise feature storage or model deployment. In my understanding, MLOps is the entire end-to-end process, from data extraction to model deployment and monitoring.


evoML Yellow Paper: Evolutionary AI and Optimisation Studio

Li, Lingbo, Kanthan, Leslie, Basios, Michail, Wu, Fan, Adham, Manal, Avagyan, Vitali, Butler, Alexis, Brookes, Paul, Giavrimis, Rafail, Liu, Buhong, Pavlou, Chrystalla, Truscott, Matthew, Voskanyan, Vardan

arXiv.org Artificial Intelligence

Machine learning model development and optimisation can be a cumbersome and resource-intensive process. Custom models are often difficult to build and deploy, and they require infrastructure and expertise that are costly to acquire and maintain. The machine learning product development lifecycle must therefore account for the difficulties of developing and deploying such models. evoML is an AI-powered tool that provides automated functionalities for machine learning model development, optimisation, and model code optimisation. Core functionalities of evoML include data cleaning, exploratory analysis, feature analysis and generation, model optimisation, model evaluation, model code optimisation, and model deployment. Additionally, a key feature of evoML is that it embeds code and model optimisation into the model development process and includes multi-objective optimisation capabilities.
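
The multi-objective idea mentioned above can be illustrated by a tiny Pareto-front selection over candidate models, trading predictive accuracy against inference latency. This is not evoML's implementation; the candidate models and their scores are invented.

```python
# Illustrative multi-objective model selection: keep candidates that are
# Pareto-optimal on (higher accuracy, lower latency). All values are made up.
candidates = [
    {"name": "logreg",  "accuracy": 0.86, "latency_ms": 0.4},
    {"name": "xgboost", "accuracy": 0.91, "latency_ms": 2.5},
    {"name": "mlp",     "accuracy": 0.90, "latency_ms": 3.1},
]

def pareto_front(models):
    """Keep models not dominated by another model on both objectives."""
    front = []
    for m in models:
        dominated = any(
            o["accuracy"] >= m["accuracy"] and o["latency_ms"] <= m["latency_ms"]
            and (o["accuracy"] > m["accuracy"] or o["latency_ms"] < m["latency_ms"])
            for o in models)
        if not dominated:
            front.append(m)
    return front

print([m["name"] for m in pareto_front(candidates)])  # mlp is dominated by xgboost
```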